AITopics | mask proposal

d4eed238cf5807c6b75face996302892-Paper-Conference.pdf

Neural Information Processing Systems | Apr-29-2026, 21:50:09 GMT

machine learning, natural language, segmentation, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
(2 more...)

Mask Matching Transformer for Few-Shot Segmentation

Neural Information Processing Systems | Apr-24-2026, 09:16:51 GMT

In this paper, we aim to tackle the challenging few-shot segmentation task from a new perspective. Typical methods follow the paradigm of first learning prototypical features from support images and then matching query features at the pixel level to obtain segmentation results. However, to obtain satisfactory segments, such a paradigm needs to couple the learning of the matching operations with heavy segmentation modules, which limits design flexibility and increases learning complexity. To alleviate this issue, we propose Mask Matching Transformer (MM-Former), a new paradigm for the few-shot segmentation task. Specifically, MM-Former first uses a class-agnostic segmenter to decompose the query image into multiple segment proposals.
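
The "typical" prototype-then-match paradigm mentioned in this abstract can be illustrated with a small sketch. It is not MM-Former itself and not the authors' code: it assumes masked average pooling for the prototype and a cosine-similarity threshold for pixel-level matching, which is one common way such baselines are built. The function names, the toy feature values, and the 0.5 threshold are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def masked_average_prototype(features, mask):
    """Average the feature vectors at positions where the support mask is 1.

    features: H x W grid of C-dimensional vectors (nested lists).
    mask:     H x W grid of 0/1 labels for the target class.
    """
    selected = [features[i][j]
                for i in range(len(mask))
                for j in range(len(mask[0]))
                if mask[i][j]]
    if not selected:
        raise ValueError("support mask is empty")
    dim = len(selected[0])
    return [sum(vec[c] for vec in selected) / len(selected) for c in range(dim)]

def match_query(query_features, prototype, threshold=0.5):
    """Label each query pixel 1 if its cosine similarity exceeds the threshold."""
    return [[1 if cosine(vec, prototype) > threshold else 0 for vec in row]
            for row in query_features]

# Toy 2x2 feature maps with 2 channels (hypothetical values, not real features).
support_features = [[[1.0, 0.0], [0.9, 0.1]],
                    [[0.0, 1.0], [0.1, 0.9]]]
support_mask = [[1, 1],
                [0, 0]]
query_features = [[[0.0, 1.0], [1.0, 0.1]],
                  [[0.8, 0.0], [0.2, 1.0]]]

prototype = masked_average_prototype(support_features, support_mask)
prediction = match_query(query_features, prototype)
print(prediction)
```

In this sketch every query pixel is compared against the prototype independently, which is the pixel-level matching the abstract contrasts with matching whole segment proposals.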

artificial intelligence, machine learning, segmentation, (14 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

d77b5482e38339a8068791d939126be2-Paper-Conference.pdf

Neural Information Processing Systems | Feb-17-2026, 09:25:07 GMT

data mining, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Robots (0.93)
(5 more...)

d4eed238cf5807c6b75face996302892-Paper-Conference.pdf

Neural Information Processing Systems | Feb-17-2026, 07:57:21 GMT

machine learning, natural language, segmentation, (18 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
(2 more...)

Learning Mask-aware CLIP Representations for Zero-Shot Segmentation

Siyu Jiao, Yunchao Wei, Yaowei Wang

Neural Information Processing Systems | Feb-14-2026, 01:47:45 GMT

Recently, pre-trained vision-language models have been increasingly used to tackle the challenging zero-shot segmentation task.

large language model, machine learning, segmentation, (17 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Israel (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.64)

661caac7729aa7d8c6b8ac0d39ccbc6a-Paper-Conference.pdf

Neural Information Processing Systems | Feb-12-2026, 22:38:35 GMT

machine learning, natural language, segmentation, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Poland (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
(3 more...)

Mask Matching Transformer for Few-Shot Segmentation

Neural Information Processing Systems | Feb-7-2026, 07:09:20 GMT

To alleviate this issue, we propose Mask Matching Transformer (MM-Former), a new paradigm for the few-shot segmentation task.

artificial intelligence, machine learning, segmentation, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Learning Mask-aware CLIP Representations for Zero-Shot Segmentation

Neural Information Processing Systems | Dec-26-2025, 00:17:13 GMT

Recently, pre-trained vision-language models have been increasingly used to tackle the challenging zero-shot segmentation task. Typical solutions follow the paradigm of first generating mask proposals and then adopting CLIP to classify them. To maintain CLIP's zero-shot transferability, previous practices favour freezing CLIP during training. However, in this paper, we reveal that CLIP is insensitive to different mask proposals and tends to produce similar predictions for various mask proposals of the same image.
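
The propose-then-classify paradigm described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's method and not the CLIP API: the image features and class text embeddings are hand-written toy vectors standing in for encoder outputs, the per-proposal embedding is obtained by mask pooling (one possible choice), and the names `mask_pooled_embedding`, `classify_proposals`, "cat" and "grass" are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def mask_pooled_embedding(features, mask):
    """Average the feature vectors covered by one binary mask proposal."""
    selected = [features[i][j]
                for i in range(len(mask))
                for j in range(len(mask[0]))
                if mask[i][j]]
    if not selected:
        raise ValueError("mask proposal is empty")
    dim = len(selected[0])
    return [sum(vec[c] for vec in selected) / len(selected) for c in range(dim)]

def classify_proposals(features, proposals, text_embeddings):
    """Give each proposal the class whose text embedding is most similar."""
    labels = []
    for mask in proposals:
        embedding = mask_pooled_embedding(features, mask)
        scores = {name: cosine(embedding, text_vec)
                  for name, text_vec in text_embeddings.items()}
        labels.append(max(scores, key=scores.get))
    return labels

# Toy stand-ins for encoder outputs (assumed values, not real CLIP features).
image_features = [[[1.0, 0.0], [1.0, 0.0]],
                  [[0.0, 1.0], [0.0, 1.0]]]
proposals = [
    [[1, 1], [0, 0]],  # proposal covering the top row
    [[0, 0], [1, 1]],  # proposal covering the bottom row
]
text_embeddings = {"cat": [1.0, 0.1], "grass": [0.1, 1.0]}

labels = classify_proposals(image_features, proposals, text_embeddings)
print(labels)
```

Here the two proposals cover regions with clearly different features, so they receive different labels. The insensitivity reported in the abstract corresponds to the case where a frozen encoder yields nearly the same embedding for every proposal of an image, so the labels would collapse to one class.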

learning mask-aware clip representation, mask proposal, name change, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.54)

The Missing Point in Vision Transformers for Universal Image Segmentation

Shahabodini, Sajjad, Mansoori, Mobina, Bayatmakou, Farnoush, Abouei, Jamshid, Plataniotis, Konstantinos N., Mohammadi, Arash

arXiv.org Artificial Intelligence | Dec-10-2025

Image segmentation remains a challenging task in computer vision, demanding robust mask generation and precise classification. Recent mask-based approaches yield high-quality masks by capturing global context. However, accurately classifying these masks, especially in the presence of ambiguous boundaries and imbalanced class distributions, remains an open challenge. In this work, we introduce ViT-P, a novel two-stage segmentation framework that decouples mask generation from classification. The first stage employs a proposal generator to produce class-agnostic mask proposals, while the second stage utilizes a point-based classification model built on the Vision Transformer (ViT) to refine predictions by focusing on mask central points. ViT-P serves as a pre-training-free adapter, allowing the integration of various pre-trained vision transformers without modifying their architecture, ensuring adaptability to dense prediction tasks. Furthermore, we demonstrate that coarse and bounding box annotations can effectively enhance classification without requiring additional training on fine annotation datasets, reducing annotation costs while maintaining strong performance. Extensive experiments across COCO, ADE20K, and Cityscapes datasets validate the effectiveness of ViT-P, achieving state-of-the-art results with 54.0 PQ on ADE20K panoptic segmentation, 87.4 mIoU on Cityscapes semantic segmentation, and 63.6 mIoU on ADE20K semantic segmentation. The code and pretrained models are available at: https://github.com/sajjad-sh33/ViT-P.

artificial intelligence, machine learning, segmentation, (14 more...)

arXiv.org Artificial Intelligence

2505.19795

Country: Europe > Switzerland (0.28)

Genre: Research Report (1.00)

Technology: